Building an Ubuntu + PyTorch + CUDA Image with a Dockerfile

Steps

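  1. Install Docker (see "Installing Docker on Ubuntu").
  2. Install the NVIDIA Container Toolkit (see NVIDIA/nvidia-docker).
  3. Have Python-3.6.9.tar.xz ready in the build directory.
  4. Build on top of the nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04 image.
  5. Install openssh-server, Python, and PyTorch.
  6. When running the image, add the flags --gpus all --ipc=host.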
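
Before building, it can help to confirm that the prerequisites from the steps above are in place. The following is a minimal sketch, not part of the original setup; it assumes the archive sits in the current directory next to the Dockerfile, and the `MISSING` variable is just an illustrative name.

```shell
#!/bin/sh
# Assumption: the archive sits in the build directory next to the Dockerfile.
TARBALL="Python-3.6.9.tar.xz"
MISSING=""

# Check that the docker CLI is on PATH (the --gpus flag needs Docker 19.03+).
command -v docker >/dev/null 2>&1 || MISSING="$MISSING docker"
# The Dockerfile's COPY step fails if the archive is absent.
[ -f "$TARBALL" ] || MISSING="$MISSING $TARBALL"

if [ -n "$MISSING" ]; then
    echo "missing:$MISSING"
else
    echo "prerequisites found"
fi
```

This only checks that the files and commands exist. It does not verify the Docker version or that the NVIDIA Container Toolkit is configured correctly.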

    Dockerfile

    # BASE IMAGE
    FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04

    # LABEL MAINTAINER
    LABEL maintainer="ltobenull@gmail.com"

    SHELL ["/bin/bash","-c"]

    WORKDIR /tmp
    # Copy the Python source archive into the image
    COPY Python-3.6.9.tar.xz /tmp
    # Set the root password (placeholder value; change it before real use)
    RUN echo 'root:password' | chpasswd \
    # Install and configure openssh-server
    && apt-get update && apt-get -y install openssh-server \
    && sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config \
    && sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/g' /etc/ssh/sshd_config \
    && mkdir -p /var/run/sshd \
    # Install the packages needed to build Python from source
    && apt-get -y install build-essential python-dev python-setuptools python-pip python-smbus \
    && apt-get -y install libncursesw5-dev libgdbm-dev libc6-dev \
    && apt-get -y install zlib1g-dev libsqlite3-dev tk-dev \
    && apt-get -y install libssl-dev openssl \
    && apt-get -y install libffi-dev \
    # Build and install Python 3.6.9 from source
    && mkdir -p /usr/local/python3.6 \
    && tar xvf Python-3.6.9.tar.xz \
    && cd Python-3.6.9 \
    && ./configure --prefix=/usr/local/python3.6 \
    && make altinstall \
    # Create symlinks so python3 and pip3 point at the new build
    && ln -snf /usr/local/python3.6/bin/python3.6 /usr/bin/python3 \
    && ln -snf /usr/local/python3.6/bin/pip3.6 /usr/bin/pip3 \
    # Install PyTorch, using the Aliyun PyPI mirror as the package index
    && mkdir -p ~/.pip && echo -e '[global]\nindex-url = https://mirrors.aliyun.com/pypi/simple/' >> ~/.pip/pip.conf \
    && pip3 install torch===1.2.0 torchvision===0.4.0 -f https://download.pytorch.org/whl/torch_stable.html \
    # Clean the apt cache and remove the copied installation files
    && apt-get clean \
    && rm -rf /tmp/* /var/tmp/*

    EXPOSE 22

    CMD ["/usr/sbin/sshd", "-D"]
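
To tie the Dockerfile back to the steps listed earlier, here is a rough sketch of building the image, starting a container with the GPU flags, and checking that PyTorch can see CUDA. The image tag, container name, and host port are hypothetical placeholders, not values from the original setup. The script prints the commands by default and only executes them when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical names; adjust the tag, container name, and host port as needed.
IMAGE_TAG="ubuntu-pytorch-cuda:demo"
CONTAINER_NAME="torch-dev"
HOST_SSH_PORT=2222

# Build from the directory holding the Dockerfile and Python-3.6.9.tar.xz.
BUILD_CMD="docker build -t $IMAGE_TAG ."
# --gpus all exposes every GPU; --ipc=host shares the host's IPC namespace,
# which gives PyTorch data-loader workers enough shared memory.
RUN_CMD="docker run -d --gpus all --ipc=host -p $HOST_SSH_PORT:22 --name $CONTAINER_NAME $IMAGE_TAG"
# Ask PyTorch inside the container whether it can see a CUDA device.
CHECK_CMD="docker exec $CONTAINER_NAME python3 -c 'import torch; print(torch.cuda.is_available())'"

# Print the commands by default; set DRY_RUN=0 to execute them.
for cmd in "$BUILD_CMD" "$RUN_CMD" "$CHECK_CMD"; do
    if [ "${DRY_RUN:-1}" = "0" ]; then
        eval "$cmd" || exit 1
    else
        echo "$cmd"
    fi
done
```

With the container running, `ssh -p 2222 root@localhost` should reach the sshd started by the `CMD` line, using the root password set in the Dockerfile.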